median_absolute_error (MedAE)#

Median Absolute Error (MedAE) is a regression metric that summarizes the typical absolute prediction error by taking the median of absolute residuals.

Compared to MAE (mean absolute error), MedAE is much more robust to outliers: as long as fewer than 50% of samples are extreme, the median barely moves.


Learning goals#

  • Define MedAE precisely (math + units)

  • Build intuition with Plotly visuals (median vs mean, outliers)

  • Implement MedAE from scratch in NumPy (including multi-output)

  • Use MedAE as an optimization objective for a simple model (derivative-free search)

  • Know pros/cons and when to use it

import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

from sklearn.metrics import mean_absolute_error, mean_squared_error, median_absolute_error

pio.templates.default = 'plotly_white'
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=4, suppress=True)

rng = np.random.default_rng(42)

1) Definition#

Let \(y \in \mathbb{R}^n\) be targets and \(\hat{y} \in \mathbb{R}^n\) predictions.

Define absolute errors:

\[ a_i = |y_i - \hat{y}_i| \]

Then the median absolute error is:

\[ \mathrm{MedAE}(y, \hat{y}) = \operatorname{median}(a_1, \dots, a_n) = \operatorname{median}_{i=1,\dots,n}\; |y_i - \hat{y}_i| \]

If we sort the absolute errors \(a_{(1)} \le \dots \le a_{(n)}\):

  • if \(n\) is odd, \(\operatorname{median}(a)=a_{((n+1)/2)}\)

  • if \(n\) is even, \(\operatorname{median}(a)=\frac{a_{(n/2)} + a_{(n/2+1)}}{2}\) (NumPy / scikit-learn convention)

In quantile notation:

\[ \mathrm{MedAE} = Q_{0.5}(|y-\hat{y}|) \]
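As a quick check of the quantile identity above (a minimal NumPy sketch with made-up numbers), the median of the absolute errors is exactly their 0.5 quantile:

```python
import numpy as np

y_true = np.array([2.0, 0.0, 4.0, 1.0])
y_pred = np.array([1.5, 0.2, 3.0, 2.0])

abs_errors = np.abs(y_true - y_pred)       # [0.5, 0.2, 1.0, 1.0]

medae = float(np.median(abs_errors))       # even n: average of the middle two
q50 = float(np.quantile(abs_errors, 0.5))  # same value, by definition

print(medae, q50)  # 0.75 0.75
```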

Multi-output targets#

If \(y \in \mathbb{R}^{n \times m}\) (m outputs), compute MedAE per output:

\[ \mathrm{MedAE}_j = \operatorname{median}_{i=1,\dots,n} |y_{ij}-\hat{y}_{ij}| \]

and then aggregate across outputs (uniform average or custom weights).
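As a minimal sketch of the multi-output case (tiny made-up arrays; section 4 gives the full sklearn-compatible version), compute the column-wise medians and then average:

```python
import numpy as np

y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])  # n=3 samples, m=2 outputs
y_pred = np.array([[1.5, 12.0], [2.5, 19.0], [2.0, 33.0]])

per_output = np.median(np.abs(y_true - y_pred), axis=0)  # MedAE_j for each column
uniform = float(per_output.mean())                       # uniform average across outputs

print(per_output.tolist(), uniform)  # [0.5, 2.0] 1.25
```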

Units#

MedAE has the same units as the target \(y\) (e.g. dollars, °C).

# A tiny example with an extreme outlier
y_true = np.array([2.0, 0.0, 4.0, 1.0, 100.0])
y_pred = np.array([1.5, 0.2, 3.0, 2.0, 0.0])

abs_errors = np.abs(y_true - y_pred)

medae_np = float(np.median(abs_errors))
medae_sklearn = median_absolute_error(y_true, y_pred)

print('y_true     =', y_true)
print('y_pred     =', y_pred)
print('|error|    =', abs_errors)
print('MedAE (NumPy)   =', medae_np)
print('MedAE (sklearn) =', medae_sklearn)
print('MAE  (compare)  =', mean_absolute_error(y_true, y_pred))
print('RMSE (compare)  =', np.sqrt(mean_squared_error(y_true, y_pred)))
y_true     = [  2.   0.   4.   1. 100.]
y_pred     = [1.5 0.2 3.  2.  0. ]
|error|    = [  0.5   0.2   1.    1.  100. ]
MedAE (NumPy)   = 1.0
MedAE (sklearn) = 1.0
MAE  (compare)  = 20.54
RMSE (compare)  = 44.726479852543726
idx = np.arange(len(y_true))
mean_abs = float(np.mean(abs_errors))

fig = go.Figure()
fig.add_trace(go.Bar(x=idx, y=abs_errors, name='|y - ŷ|'))
fig.add_hline(
    y=medae_np,
    line_dash='dash',
    line_color='black',
    annotation_text=f'median={medae_np:.2f}',
)
fig.add_hline(
    y=mean_abs,
    line_dash='dot',
    line_color='crimson',
    annotation_text=f'mean={mean_abs:.2f}',
)
fig.update_layout(
    title='Absolute errors: median vs mean',
    xaxis_title='sample index',
    yaxis_title='|error|',
)
fig.show()

2) Intuition: “typical” error (the 50th percentile)#

MedAE does not average errors. It asks:

“What is the error magnitude that half of my predictions beat?”

Equivalently, MedAE is the 50th percentile (median) of the absolute-error distribution.

  • If a few predictions are catastrophically wrong, they usually land in the upper tail of \(|\mathrm{error}|\).

  • The median ignores that tail unless the tail grows to more than half of the samples.
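A tiny numeric illustration (errors made up for the sketch): multiplying the single worst error by 100 leaves the median untouched, while the mean balloons:

```python
import numpy as np

abs_errors = np.array([0.1, 0.2, 0.3, 0.4, 5.0])  # one bad miss in the tail
worse = abs_errors.copy()
worse[-1] *= 100.0                                # the bad miss becomes 100x worse

print(np.median(abs_errors), np.median(worse))    # 0.3 0.3 -> median unchanged
print(np.mean(abs_errors), np.mean(worse))        # ~1.2 vs ~100.2
```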

n = 400
y_true = rng.normal(0, 1, size=n)
y_pred = y_true + rng.normal(0, 0.4, size=n)

# Inject a few extreme misses
out_idx = rng.choice(n, size=8, replace=False)
y_pred[out_idx] += rng.normal(0, 10.0, size=out_idx.size)

abs_errors = np.abs(y_true - y_pred)
med = float(np.median(abs_errors))
mean_ = float(np.mean(abs_errors))

fig = px.histogram(abs_errors, nbins=40, title='Distribution of |error|')
fig.add_vline(x=med, line_dash='dash', line_color='black', annotation_text=f'median={med:.2f}')
fig.add_vline(x=mean_, line_dash='dot', line_color='crimson', annotation_text=f'mean={mean_:.2f}')
fig.update_layout(xaxis_title='|error|', yaxis_title='count')
fig.show()

3) Outliers: why MedAE is robust#

Because the median has a 50% breakdown point, MedAE can tolerate large outliers as long as they affect < 50% of samples:

  • With 10% bad points, making those points 10× worse barely changes MedAE.

  • MAE and RMSE do change because they average (and RMSE squares) the tail.

Let’s simulate that.

n = 300
y_true = rng.normal(0, 1, size=n)
y_pred_base = y_true + rng.normal(0, 0.3, size=n)

outlier_frac = 0.10
k = int(outlier_frac * n)
out_idx = rng.choice(n, size=k, replace=False)
signs = rng.choice([-1.0, 1.0], size=k)

magnitudes = np.linspace(0, 25, 60)

medae_vals = []
mae_vals = []
rmse_vals = []

for m in magnitudes:
    y_pred = y_pred_base.copy()
    y_pred[out_idx] += signs * m

    medae_vals.append(float(np.median(np.abs(y_true - y_pred))))
    mae_vals.append(mean_absolute_error(y_true, y_pred))
    rmse_vals.append(np.sqrt(mean_squared_error(y_true, y_pred)))

fig = go.Figure()
fig.add_trace(go.Scatter(x=magnitudes, y=medae_vals, mode='lines', name='MedAE (median)'))
fig.add_trace(go.Scatter(x=magnitudes, y=mae_vals, mode='lines', name='MAE (mean)'))
fig.add_trace(go.Scatter(x=magnitudes, y=rmse_vals, mode='lines', name='RMSE (sqrt MSE)'))
fig.update_layout(
    title=f'Effect of {outlier_frac:.0%} extreme outliers on common metrics',
    xaxis_title='outlier magnitude added to predictions',
    yaxis_title='metric value (same units as y)',
)
fig.show()
def medae_with_fraction_outliers(frac, *, seed_offset=0):
    local_rng = np.random.default_rng(123 + seed_offset)
    k = int(frac * n)
    idx = local_rng.choice(n, size=k, replace=False)
    signs = local_rng.choice([-1.0, 1.0], size=k)

    vals = []
    for m in magnitudes:
        y_pred = y_pred_base.copy()
        y_pred[idx] += signs * m
        vals.append(float(np.median(np.abs(y_true - y_pred))))
    return vals


medae_10 = medae_with_fraction_outliers(0.10, seed_offset=0)
medae_60 = medae_with_fraction_outliers(0.60, seed_offset=1)

fig = go.Figure()
fig.add_trace(go.Scatter(x=magnitudes, y=medae_10, mode='lines', name='MedAE (10% outliers)'))
fig.add_trace(go.Scatter(x=magnitudes, y=medae_60, mode='lines', name='MedAE (60% outliers)'))
fig.update_layout(
    title='Median breakdown: once >50% are outliers, MedAE moves too',
    xaxis_title='outlier magnitude added to predictions',
    yaxis_title='MedAE',
)
fig.show()

4) NumPy implementation (from scratch)#

scikit-learn’s median_absolute_error supports:

  • 1D targets: y.shape == (n_samples,)

  • multi-output targets: y.shape == (n_samples, n_outputs)

  • aggregation via multioutput:

    • 'raw_values' → one MedAE per output

    • 'uniform_average' → average over outputs (default)

    • array-like weights → weighted average over outputs

A caveat on weights: there is no universally agreed “weighted median” convention, so older scikit-learn versions rejected sample_weight for MedAE; since version 0.24 it is supported via a weighted percentile. The from-scratch version below handles the unweighted case only.
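If you do want a weighted notion of MedAE, one common convention (a sketch only; the helper names here are made up, and this is not necessarily what scikit-learn does internally) takes the smallest absolute error whose cumulative weight reaches half the total weight:

```python
import numpy as np

def weighted_median(values, weights):
    """Lower weighted median: smallest v whose cumulative weight >= half the total."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    idx = int(np.searchsorted(cum, 0.5 * w.sum()))
    return float(v[idx])

def weighted_medae(y_true, y_pred, sample_weight):
    errors = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return weighted_median(errors, sample_weight)

# Equal weights recover the (lower) median of |error| = median([1.0, 0.0, 3.0]) = 1.0
print(weighted_medae([2.0, 0.0, 4.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]))  # 1.0
# Up-weighting the sample with the large error drags the weighted median toward it
print(weighted_medae([2.0, 0.0, 4.0], [1.0, 0.0, 1.0], [1.0, 1.0, 5.0]))  # 3.0
```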

def _as_2d(y):
    y = np.asarray(y)
    if y.ndim == 1:
        return y.reshape(-1, 1)
    if y.ndim == 2:
        return y
    raise ValueError(f'y must be 1D or 2D, got shape={y.shape}')


def median_absolute_error_np(y_true, y_pred, *, multioutput='uniform_average'):
    """NumPy implementation compatible with sklearn.metrics.median_absolute_error.

    Parameters
    ----------
    y_true, y_pred : array-like of shape (n_samples,) or (n_samples, n_outputs)
    multioutput : {'raw_values', 'uniform_average'} or array-like of shape (n_outputs,)

    Returns
    -------
    float or np.ndarray
    """
    y_true_2d = _as_2d(y_true)
    y_pred_2d = _as_2d(y_pred)

    if y_true_2d.shape != y_pred_2d.shape:
        raise ValueError(
            f'y_true and y_pred must have the same shape, got {y_true_2d.shape} vs {y_pred_2d.shape}'
        )

    abs_errors = np.abs(y_true_2d - y_pred_2d)
    med_per_output = np.median(abs_errors, axis=0)

    if isinstance(multioutput, str):
        if multioutput == 'raw_values':
            return med_per_output
        if multioutput == 'uniform_average':
            return float(np.mean(med_per_output))
        raise ValueError(
            "multioutput must be 'raw_values', 'uniform_average', or an array of output weights"
        )

    weights = np.asarray(multioutput, dtype=float)
    if weights.shape != med_per_output.shape:
        raise ValueError(
            f'multioutput weights must have shape {med_per_output.shape}, got {weights.shape}'
        )
    return float(np.average(med_per_output, weights=weights))
# 1D check
y_true = rng.normal(size=50)
y_pred = y_true + rng.normal(scale=0.3, size=50)

print('MedAE (np)     =', median_absolute_error_np(y_true, y_pred))
print('MedAE (sklearn)=', median_absolute_error(y_true, y_pred))

# Multi-output check
y_true = rng.normal(size=(80, 3))
y_pred = y_true + rng.normal(scale=0.5, size=(80, 3))

print('\nraw_values (np)     =', median_absolute_error_np(y_true, y_pred, multioutput='raw_values'))
print('raw_values (sklearn)=', median_absolute_error(y_true, y_pred, multioutput='raw_values'))

weights = np.array([0.2, 0.3, 0.5])
print('\nweighted (np)     =', median_absolute_error_np(y_true, y_pred, multioutput=weights))
print('weighted (sklearn)=', median_absolute_error(y_true, y_pred, multioutput=weights))
MedAE (np)     = 0.22811342804441603
MedAE (sklearn)= 0.22811342804441603

raw_values (np)     = [0.3048 0.3299 0.3327]
raw_values (sklearn)= [0.3048 0.3299 0.3327]

weighted (np)     = 0.3263042693906659
weighted (sklearn)= 0.3263042693906659

5) Using MedAE to optimize a model#

MedAE is great as an evaluation metric, but it’s awkward as a training loss:

  • it uses a median (an order statistic)

  • it is non-smooth and has plateaus

  • for many models, the objective is not convex

Still, you can optimize it directly with derivative-free methods. We’ll work through two simple cases:

  1. the best constant predictor under MedAE (closed form)

  2. fitting a line by directly minimizing MedAE with a randomized search

5.1 Constant predictor: minimize MedAE(y, c)#

Consider the constant model \(\hat{y}_i = c\) for all \(i\).

The objective is:

\[ J(c) = \mathrm{MedAE}(y, c) = \operatorname{median}_{i} |y_i - c| \]

Interpretation in 1D:

  • Let \(r \ge 0\).

  • The condition \(J(c) \le r\) means at least half of the points lie inside the interval \([c-r,\, c+r]\).

So minimizing \(J(c)\) is equivalent to:

Find the shortest interval that contains at least half the samples, then place \(c\) in the middle of that interval.

Algorithm (stated for odd \(n\); the even-\(n\) averaging convention used by NumPy/scikit-learn follows the same idea):

  1. Sort \(y_{(1)} \le \dots \le y_{(n)}\)

  2. Let \(k = \lfloor n/2 \rfloor + 1\) (for odd \(n\), this equals \(\lceil n/2 \rceil\))

  3. Slide a window of size \(k\) and compute widths: \( w_i = y_{(i+k-1)} - y_{(i)}, \quad i = 1,\dots,n-k+1 \)

  4. Pick the tightest window \(i^* = \arg\min_i w_i\)

  5. Set: \( c^* = \frac{y_{(i^*)} + y_{(i^*+k-1)}}{2} \)

This is different from MAE: minimizing MAE over constants yields the (usual) median of \(y\), while minimizing MedAE finds the center of the densest half of the data (a “mode-like” behavior).

def best_constant_medae(y):
    """Return (c_star, MedAE(y, c_star)) for the constant predictor y_hat=c.

    Uses the shortest half-sample interval (window of size floor(n/2) + 1).
    """
    y = np.asarray(y, dtype=float).ravel()
    if y.size == 0:
        raise ValueError('y must be non-empty')

    y_sorted = np.sort(y)
    n = y_sorted.size
    k = n // 2 + 1  # floor(n/2) + 1; equals ceil(n/2) when n is odd

    widths = y_sorted[k - 1 :] - y_sorted[: n - k + 1]
    i = int(np.argmin(widths))

    c_star = 0.5 * (y_sorted[i] + y_sorted[i + k - 1])
    medae_star = float(np.median(np.abs(y - c_star)))
    return c_star, medae_star


# A two-cluster example (dense cluster on the right)
y = np.array([0, 1, 2, 3, 100, 101, 102, 103, 104], dtype=float)

c_median = float(np.median(y))  # minimizer of MAE over constants
c_star, medae_star = best_constant_medae(y)

c_grid = np.linspace(y.min() - 5, y.max() + 5, 800)
loss = np.array([float(np.median(np.abs(y - c))) for c in c_grid])

print(f'median(y) (MAE-optimal constant) = {c_median:.2f}')
print(f'c* (MedAE-optimal constant)      = {c_star:.2f}')
print(f'MedAE(y, c*) = {medae_star:.2f}')
print(f'MedAE(y, median(y)) = {float(np.median(np.abs(y - c_median))):.2f}')

fig = go.Figure()
fig.add_trace(go.Scatter(x=c_grid, y=loss, mode='lines', name='MedAE(y, c)'))
fig.add_vline(
    x=c_median,
    line_dash='dot',
    line_color='gray',
    annotation_text=f'median(y)={c_median:.1f}',
)
fig.add_vline(
    x=c_star,
    line_dash='dash',
    line_color='black',
    annotation_text=f'c*={c_star:.1f}',
)
fig.update_layout(
    title='Constant predictor: minimizing MedAE targets the densest half of y',
    xaxis_title='constant prediction c',
    yaxis_title='MedAE(y, c)',
)
fig.show()
median(y) (MAE-optimal constant) = 100.00
c* (MedAE-optimal constant)      = 102.00
MedAE(y, c*) = 2.00
MedAE(y, median(y)) = 4.00

5.2 Fitting a line by directly minimizing MedAE#

Now consider a simple linear model:

\[ \hat{y} = b_0 + b_1 x \]

If we train by MedAE:

\[ J(b_0, b_1) = \operatorname{median}_i |y_i - (b_0 + b_1 x_i)| \]

This surface is typically non-smooth (lots of flat regions), so we’ll use a very simple randomized hill-climbing optimizer:

  • start from an initial guess (we’ll use OLS as a warm start)

  • randomly propose small changes to \((b_0, b_1)\)

  • keep the proposal only if it reduces MedAE

  • slowly shrink the proposal scale

This isn’t production-grade optimization, but it shows how the metric can be used as a direct objective.

n = 180
x = rng.uniform(-2, 2, size=n)

b0_true, b1_true = 1.0, 2.0
y_clean = b0_true + b1_true * x + rng.normal(0, 0.3, size=n)

y = y_clean.copy()

n_out = int(0.15 * n)
out_idx = rng.choice(n, size=n_out, replace=False)
y[out_idx] += rng.normal(0, 8.0, size=n_out)

X = np.column_stack([np.ones_like(x), x])
b0_ols, b1_ols = np.linalg.lstsq(X, y, rcond=None)[0]

print(f'true params: b0={b0_true:.3f}, b1={b1_true:.3f}')
print(f'OLS  params: b0={b0_ols:.3f}, b1={b1_ols:.3f}')

is_outlier = np.zeros(n, dtype=bool)
is_outlier[out_idx] = True
colors = np.where(is_outlier, 'crimson', 'royalblue')

fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=x,
        y=y,
        mode='markers',
        name='observations',
        marker=dict(color=colors, size=7, opacity=0.85),
    )
)
fig.update_layout(title='Synthetic regression data with outliers', xaxis_title='x', yaxis_title='y')
fig.show()
true params: b0=1.000, b1=2.000
OLS  params: b0=1.270, b1=2.012
def medae_for_line(x, y, b0, b1):
    y_hat = b0 + b1 * x
    return float(np.median(np.abs(y - y_hat)))


def fit_line_medae_hillclimb(
    x,
    y,
    *,
    b0_init=0.0,
    b1_init=0.0,
    n_steps=8000,
    step0=1.5,
    decay=0.9995,
):
    b0, b1 = float(b0_init), float(b1_init)
    best = medae_for_line(x, y, b0, b1)

    step_b0 = float(step0)
    step_b1 = float(step0)

    history = {
        'step': [],
        'b0': [],
        'b1': [],
        'medae': [],
    }

    for t in range(n_steps):
        cand_b0 = b0 + step_b0 * rng.normal()
        cand_b1 = b1 + step_b1 * rng.normal()

        cand = medae_for_line(x, y, cand_b0, cand_b1)
        if cand < best:
            b0, b1, best = cand_b0, cand_b1, cand

        history['step'].append(t)
        history['b0'].append(b0)
        history['b1'].append(b1)
        history['medae'].append(best)

        step_b0 *= decay
        step_b1 *= decay

    return (b0, b1), history


(b0_medae, b1_medae), hist = fit_line_medae_hillclimb(
    x,
    y,
    b0_init=b0_ols,
    b1_init=b1_ols,
    n_steps=8000,
    step0=1.5,
    decay=0.9995,
)


def summarize(b0, b1):
    y_hat = b0 + b1 * x
    return {
        'MedAE': float(np.median(np.abs(y - y_hat))),
        'MAE': mean_absolute_error(y, y_hat),
        'RMSE': np.sqrt(mean_squared_error(y, y_hat)),
    }


print('OLS metrics:', summarize(b0_ols, b1_ols))
print('MedAE-optimized metrics:', summarize(b0_medae, b1_medae))

fig = px.line(
    y=hist['medae'],
    title='Hill-climbing: best MedAE so far',
    labels={'index': 'step', 'y': 'best MedAE'},
)
fig.show()

x_line = np.linspace(x.min(), x.max(), 200)

fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=y, mode='markers', name='data', marker=dict(color=colors, size=7, opacity=0.85)))
fig.add_trace(
    go.Scatter(
        x=x_line,
        y=b0_ols + b1_ols * x_line,
        mode='lines',
        name='OLS (min MSE)',
        line=dict(color='black', dash='dot'),
    )
)
fig.add_trace(
    go.Scatter(
        x=x_line,
        y=b0_medae + b1_medae * x_line,
        mode='lines',
        name='Min MedAE (hill-climb)',
        line=dict(color='seagreen'),
    )
)
fig.update_layout(title='Line fit: OLS vs directly minimizing MedAE', xaxis_title='x', yaxis_title='y')
fig.show()
OLS metrics: {'MedAE': 0.3255882359758968, 'MAE': 1.2562381006417402, 'RMSE': 3.206246485594046}
MedAE-optimized metrics: {'MedAE': 0.23791194121513426, 'MAE': 1.198741198055658, 'RMSE': 3.212982820690415}
# Visualize the MedAE surface over (b0, b1) with the (downsampled) search path

b0_grid = np.linspace(b0_ols - 4.0, b0_ols + 4.0, 160)
b1_grid = np.linspace(b1_ols - 4.0, b1_ols + 4.0, 160)

B0, B1 = np.meshgrid(b0_grid, b1_grid)
residuals = y[None, None, :] - (B0[:, :, None] + B1[:, :, None] * x[None, None, :])
Z = np.median(np.abs(residuals), axis=2)

stride = max(1, len(hist['step']) // 150)
b0_path = np.array(hist['b0'])[::stride]
b1_path = np.array(hist['b1'])[::stride]

fig = go.Figure()
fig.add_trace(
    go.Contour(
        x=b0_grid,
        y=b1_grid,
        z=Z,
        contours_coloring='heatmap',
        colorbar=dict(title='MedAE'),
    )
)
fig.add_trace(go.Scatter(x=b0_path, y=b1_path, mode='lines', name='search path', line=dict(color='white')))
fig.add_trace(
    go.Scatter(
        x=[b0_ols],
        y=[b1_ols],
        mode='markers',
        name='OLS start',
        marker=dict(color='black', symbol='x', size=10),
    )
)
fig.add_trace(
    go.Scatter(
        x=[b0_medae],
        y=[b1_medae],
        mode='markers',
        name='best found',
        marker=dict(color='lime', size=10),
    )
)
fig.update_layout(
    title='MedAE surface over (b0, b1): non-smooth & plateau-heavy',
    xaxis_title='b0 (intercept)',
    yaxis_title='b1 (slope)',
)
fig.show()

6) Pros, cons, and when to use MedAE#

Pros#

  • Robust to outliers: the median ignores extreme errors until they affect >50% of samples.

  • Interpretable units: same unit as the target.

  • Measures typical performance (50th percentile), which can match user-facing targets (“median miss”).

Cons / pitfalls#

  • Hides tail risk: two models can have the same MedAE while one has much worse rare failures.

  • Hard to optimize directly: the median makes the objective non-smooth with plateaus; gradient methods don’t apply cleanly.

  • Weighted versions are convention-dependent: there is no single “weighted median” definition (scikit-learn only added sample_weight support for MedAE in version 0.24).

  • Can be unstable on small datasets (the median jumps when a single point crosses the middle).
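To make the tail-risk point concrete (absolute errors made up for the sketch): two error distributions can share a MedAE while one hides a catastrophic miss:

```python
import numpy as np

errors_a = np.array([0.9, 0.95, 1.0, 1.0, 1.1])   # model A: tight errors
errors_b = np.array([0.9, 1.0, 1.0, 1.1, 500.0])  # model B: same median, huge tail

print(np.median(errors_a), np.median(errors_b))   # 1.0 1.0 -> identical MedAE
print(errors_a.max(), errors_b.max())             # 1.1 vs 500.0 -> very different risk
```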

Good use cases#

  • Regression with heavy-tailed noise / occasional gross outliers (sensor glitches, messy labels).

  • When “typical error” matters more than worst-case or average (e.g., median ETA error).

  • As a validation metric alongside MAE/RMSE/max error to capture multiple aspects of error.

7) Exercises#

  1. Create a dataset where 40% of points are extreme outliers. Verify MedAE barely moves, then repeat with 60% and observe the breakdown.

  2. Implement a weighted median absolute error (choose a clear weighted-median convention) and test it on toy data.

  3. Compare OLS vs MedAE-optimized line on data with asymmetric noise (one-sided outliers). Which metric better matches what you care about?

References#

  • scikit-learn docs: https://scikit-learn.org/stable/modules/model_evaluation.html#median-absolute-error

  • Robust statistics (median, breakdown point): https://en.wikipedia.org/wiki/Robust_statistics

  • Rousseeuw, Leroy — Robust Regression and Outlier Detection (least-median ideas)